Baseline correction

This example looks at the first step in TAP data preprocessing: the baseline correction. The baseline is the part of the flux, in time, where the measured molecular signal should be zero. This occurs either before a pulse has been initiated or after enough time has passed that all of the gas has diffused out of the reactor. However, due to instrument drift and ionization effects, the outlet response may exhibit a voltage shift over all time points. This shift is assumed to be constant over time, so a non-linear baseline is not accounted for.

Below is an example of a flux that has a baseline shift:

Traditional baseline correction

The traditional method of baseline correction is to determine where in time the flux is no longer changing (a slope of zero) and take the average of the flux over those baseline points. For example, visual inspection shows that the baseline can be approximated from 4.5 to 5 seconds, where the mean of the baseline should be approximately 0.3. This mean is verified below:
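As a minimal sketch of this averaging step, the snippet below uses plain NumPy on a synthetic flux rather than the tutorial's data. The pulse shape, the noise level, and the variable names are all assumptions made for illustration; only the 4.5 to 5 second window and the 0.3 offset come from the text.

```python
import numpy as np

# Synthetic stand-in for a TAP flux (an assumption, not the tutorial's data):
# a pulse peaking at 0.2 s, a constant 0.3 offset, and a little noise.
rng = np.random.default_rng(0)
times = np.linspace(0.0, 5.0, 5001)
pulse = (times / 0.2) * np.exp(1.0 - times / 0.2)
flux = pulse + 0.3 + rng.normal(scale=0.01, size=times.size)

# Average the points in the flat window chosen by visual inspection.
in_window = (times >= 4.5) & (times <= 5.0)
baseline_amount = flux[in_window].mean()

# Subtract the constant offset from every time point.
corrected_flux = flux - baseline_amount
```

After the subtraction, the mean of `corrected_flux` over the same window is zero by construction, which is a quick sanity check on the chosen window.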

The baseline mean is in fact approximately 0.3. In tapsap, the baseline_correction function reproduces this result. The function can take a baseline time range, a baseline amount, or neither, in which case the last 95% of the points are taken as the baseline. The function returns a dictionary containing the 'flux' and the 'baseline_amount'. Examples of each are given below by plotting the baseline-corrected flux.
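The three ways of calling the function can be illustrated with a small stand-in. The helper below, `subtract_baseline`, is hypothetical and is not the tapsap implementation; its argument names only mirror the description above. Its fallback averages the points past the 95% mark of the record, which is one reading of the default and should be treated as an assumption.

```python
import numpy as np

def subtract_baseline(flux, times, baseline_time_range=None,
                      baseline_amount=None, fallback_start_fraction=0.95):
    """Hypothetical helper sketching the three input modes (not tapsap code)."""
    flux = np.asarray(flux, dtype=float)
    times = np.asarray(times, dtype=float)
    if baseline_amount is None:
        if baseline_time_range is not None:
            # Mode 1: average the flux inside the given time window.
            start, end = baseline_time_range
            mask = (times >= start) & (times <= end)
        else:
            # Mode 3 (assumed fallback): average the trailing points.
            first = int(fallback_start_fraction * flux.size)
            mask = np.arange(flux.size) >= first
        baseline_amount = float(flux[mask].mean())
    # Mode 2 falls through: a supplied amount is subtracted directly.
    return {"flux": flux - baseline_amount, "baseline_amount": baseline_amount}

# Synthetic flux with a 0.3 offset (illustrative data only).
times = np.linspace(0.0, 5.0, 501)
flux = (times / 0.2) * np.exp(1.0 - times / 0.2) + 0.3

by_range = subtract_baseline(flux, times, baseline_time_range=[4.5, 5.0])
by_amount = subtract_baseline(flux, times, baseline_amount=0.3)
by_default = subtract_baseline(flux, times)
```

Returning both the corrected flux and the amount, as the text describes, lets the caller check the estimate before trusting the corrected curve.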

Baseline correction via the Gamma distribution

Sometimes the flux does not reach a baseline. Slow reactions or molecules sticking to the inert material may result in a flux that never settles within the measurement window. To address this, the flux can be approximated by a statistical distribution of the molecules with respect to time. More specifically, the Gamma distribution is used because it is connected to the velocity of molecules in Knudsen diffusion and to a series of CSTRs. This method first determines the approximate Gamma distribution from the peak residence time (the time of the maximum of the flux, approximately 0.2 seconds in this example) and then uses the area of the Gamma distribution to correct the area of the flux. This function (baseline_gamma) does not require any inputs beyond the flux and the time.
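The idea can be sketched as follows. This is not the baseline_gamma implementation, whose internals are not shown here. It assumes the flux is a scaled Gamma density plus a constant offset, fixes the shape parameter at an illustrative value of 3, sets the scale from the peak time using the Gamma mode, (shape - 1) * scale, and then solves the peak-height and area equations for the offset. The function and variable names are hypothetical.

```python
import math
import numpy as np

def gamma_pdf(t, shape, scale):
    """Gamma probability density, written out with NumPy and math only."""
    t = np.asarray(t, dtype=float)
    norm = math.gamma(shape) * scale ** shape
    return t ** (shape - 1.0) * np.exp(-t / scale) / norm

def trapezoid_area(y, t):
    """Area under y(t) by the trapezoidal rule."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def gamma_baseline_sketch(flux, times, shape=3.0):
    """Estimate a constant offset assuming flux = A * gamma_pdf + offset.

    The fixed shape is an assumption of this sketch, and times is
    assumed to start at zero (the pulse time).
    """
    flux = np.asarray(flux, dtype=float)
    times = np.asarray(times, dtype=float)
    peak_index = int(np.argmax(flux))
    peak_time = times[peak_index]
    if peak_time <= 0.0 or shape <= 1.0:
        raise ValueError("need a peak after t = 0 and shape > 1")
    scale = peak_time / (shape - 1.0)  # from the Gamma mode
    curve = gamma_pdf(times, shape, scale)
    # Two equations in the unknowns A and offset:
    #   flux at peak = A * curve at peak + offset
    #   flux area    = A * curve area    + offset * duration
    ratio = trapezoid_area(curve, times) / curve[peak_index]
    duration = times[-1] - times[0]
    offset = (trapezoid_area(flux, times) - flux[peak_index] * ratio) / (duration - ratio)
    return {"flux": flux - offset, "baseline_amount": float(offset)}

# Noise-free synthetic check: a Gamma-shaped pulse with a 0.3 offset.
times = np.linspace(0.0, 5.0, 5001)
flux = 1.5 * gamma_pdf(times, 3.0, 0.1) + 0.3
result = gamma_baseline_sketch(flux, times)
```

Because the scale comes entirely from the location of the maximum, any error in the peak time propagates directly into the estimated offset, so noisy data can be expected to degrade the estimate.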

The baseline amount provided by baseline_gamma is not quite what is expected (0.26 compared with the expected 0.3). This could indicate either that the reaction is not complete or that the peak residence time cannot be accurately measured due to the noise. With that in mind, the process is repeated using a cleaned (smoothed) version of the flux:

Smoothing the flux resulted in the baseline amount being closer to the initial value of 0.3 without any estimation of where the baseline starts and ends.
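To see why smoothing helps, the sketch below locates the peak before and after smoothing a noisy synthetic flux. The centered moving average, its window length, and the noise level are all assumptions chosen for illustration; the smoother that tapsap applies may differ.

```python
import numpy as np

def moving_average(values, window=51):
    """Centered moving average (illustrative smoother, not tapsap's).

    mode="same" pads with zeros, so the first and last window // 2
    points are biased toward zero and should not be trusted.
    """
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="same")

# Noisy synthetic flux: pulse peaking at 0.2 s plus a 0.3 offset.
rng = np.random.default_rng(0)
times = np.linspace(0.0, 5.0, 5001)
clean = (times / 0.2) * np.exp(1.0 - times / 0.2) + 0.3
noisy = clean + rng.normal(scale=0.05, size=times.size)
smoothed = moving_average(noisy)

# The peak residence time is read off as the time of the maximum.
noisy_peak_time = times[np.argmax(noisy)]
smoothed_peak_time = times[np.argmax(smoothed)]
```

A wider window suppresses more noise but also flattens and shifts the peak, so the window length is a trade-off between noise reduction and distortion of the pulse.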

Application to a Transient object

The above code can be applied to each value of the dataframe, but it is also available through the baseline_correct method of the Transient class. The baseline_correct method takes the arguments baseline_time_range (a list containing the start and end time), baseline_amount (a float giving the correction amount) and smooth_flux (whether to smooth the flux for baseline_gamma). The difference between this method and the baseline_correction function is that if neither baseline_time_range nor baseline_amount is given, the baseline_gamma function is used. The example below processes the data using a given time range.